Generalize ereport_startup_progress infrastructure

  • Jump to comment-1
    bharath.rupireddyforpostgres@gmail.com2022-08-02T07:25:01+00:00
    Hi, ereport_startup_progress infrastructure added by commit 9ce346e [1] will be super-useful for reporting progress of any long-running server operations, not just the startup process operations. For instance, postmaster can use it for reporting progress of temp file and temp relation file removals [2], checkpointer can use it for reporting progress of snapshot or mapping file processing or even WAL file processing and so on. And I'm sure there can be many places in the code where we have while or for loops which can, at times, take a long time to finish and having a log message there would definitely help. Here's an attempt to generalize the ereport_startup_progress infrastructure. The attached v1 patch places the code in elog.c/.h, renames associated functions and variables, something like ereport_startup_progress to ereport_progress, log_startup_progress_interval to log_progress_report_interval and so on. Thoughts? Thanks Robert for an offlist chat. [1] commit 9ce346eabf350a130bba46be3f8c50ba28506969 Author: Robert Haas <rhaas@postgresql.org> Date: Mon Oct 25 11:51:57 2021 -0400 Report progress of startup operations that take a long time. [2] https://www.postgresql.org/message-id/CALj2ACWeUFhhnDJKm6R5YxCsF4K7aB2pmRMvqP0BVTxdyce3EA%40mail.gmail.com -- Bharath Rupireddy RDS Open Source Databases: https://aws.amazon.com/
    • Jump to comment-1
      robertmhaas@gmail.com2022-08-02T18:40:56+00:00
      On Tue, Aug 2, 2022 at 3:25 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > ereport_startup_progress infrastructure added by commit 9ce346e [1] > will be super-useful for reporting progress of any long-running server > operations, not just the startup process operations. For instance, > postmaster can use it for reporting progress of temp file and temp > relation file removals [2], checkpointer can use it for reporting > progress of snapshot or mapping file processing or even WAL file > processing and so on. And I'm sure there can be many places in the > code where we have while or for loops which can, at times, take a long > time to finish and having a log message there would definitely help. > > Here's an attempt to generalize the ereport_startup_progress > infrastructure. The attached v1 patch places the code in elog.c/.h, > renames associated functions and variables, something like > ereport_startup_progress to ereport_progress, > log_startup_progress_interval to log_progress_report_interval and so > on. I'm not averse to reusing this infrastructure in other places, but I doubt we'd want all of those places to be controlled by a single GUC, especially because that GUC is also the on/off switch for the feature. -- Robert Haas EDB: http://www.enterprisedb.com
      • Jump to comment-1
        bharath.rupireddyforpostgres@gmail.com2022-08-04T04:27:48+00:00
        On Wed, Aug 3, 2022 at 12:11 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Tue, Aug 2, 2022 at 3:25 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > ereport_startup_progress infrastructure added by commit 9ce346e [1] > > will be super-useful for reporting progress of any long-running server > > operations, not just the startup process operations. For instance, > > postmaster can use it for reporting progress of temp file and temp > > relation file removals [2], checkpointer can use it for reporting > > progress of snapshot or mapping file processing or even WAL file > > processing and so on. And I'm sure there can be many places in the > > code where we have while or for loops which can, at times, take a long > > time to finish and having a log message there would definitely help. > > > > Here's an attempt to generalize the ereport_startup_progress > > infrastructure. The attached v1 patch places the code in elog.c/.h, > > renames associated functions and variables, something like > > ereport_startup_progress to ereport_progress, > > log_startup_progress_interval to log_progress_report_interval and so > > on. > > I'm not averse to reusing this infrastructure in other places, but I > doubt we'd want all of those places to be controlled by a single GUC, > especially because that GUC is also the on/off switch for the feature. Thanks Robert! How about we tweak the function a bit - begin_progress_report_phase(int timeout), so that each process can use their own timeout interval? In this case, do we want to retain log_startup_progress_interval as-is specific to the startup process? If yes, other processes might come up with their own GUCs (if they don't want to use hard-coded timeouts) similar to log_startup_progress_interval, which isn't the right way IMO. I think the notion of ereport_progress feature being disabled when the timeout is 0, makes sense to me at least. On the flip side, what if we just have a single GUC log_progress_report_interval (as proposed in the v1 patch)? Do we ever want different processes to emit progress report messages at different frequencies? Well, I can think of the startup process during standby recovery needing to emit recovery progress report messages at a much lower frequency than the startup process during the crash recovery. Again, controlling the frequencies with different GUCs isn't the way forward. But we can do something like: process 1 emits messages with a frequency of 2*log_progress_report_interval, process 2 with a frequency 4*log_progress_report_interval and so on without needing additional GUCs. Thoughts? -- Bharath Rupireddy RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
        • Jump to comment-1
          bharath.rupireddyforpostgres@gmail.com2022-08-08T04:29:09+00:00
          On Thu, Aug 4, 2022 at 9:57 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Aug 3, 2022 at 12:11 AM Robert Haas <robertmhaas@gmail.com> wrote: > > > > On Tue, Aug 2, 2022 at 3:25 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > ereport_startup_progress infrastructure added by commit 9ce346e [1] > > > will be super-useful for reporting progress of any long-running server > > > operations, not just the startup process operations. For instance, > > > postmaster can use it for reporting progress of temp file and temp > > > relation file removals [2], checkpointer can use it for reporting > > > progress of snapshot or mapping file processing or even WAL file > > > processing and so on. And I'm sure there can be many places in the > > > code where we have while or for loops which can, at times, take a long > > > time to finish and having a log message there would definitely help. > > > > > > Here's an attempt to generalize the ereport_startup_progress > > > infrastructure. The attached v1 patch places the code in elog.c/.h, > > > renames associated functions and variables, something like > > > ereport_startup_progress to ereport_progress, > > > log_startup_progress_interval to log_progress_report_interval and so > > > on. > > > > I'm not averse to reusing this infrastructure in other places, but I > > doubt we'd want all of those places to be controlled by a single GUC, > > especially because that GUC is also the on/off switch for the feature. > > Thanks Robert! How about we tweak the function a bit - > begin_progress_report_phase(int timeout), so that each process can use > their own timeout interval? In this case, do we want to retain > log_startup_progress_interval as-is specific to the startup process? > If yes, other processes might come up with their own GUCs (if they > don't want to use hard-coded timeouts) similar to > log_startup_progress_interval, which isn't the right way IMO. > > I think the notion of ereport_progress feature being disabled when the > timeout is 0, makes sense to me at least. > > On the flip side, what if we just have a single GUC > log_progress_report_interval (as proposed in the v1 patch)? Do we ever > want different processes to emit progress report messages at different > frequencies? Well, I can think of the startup process during standby > recovery needing to emit recovery progress report messages at a much > lower frequency than the startup process during the crash recovery. > Again, controlling the frequencies with different GUCs isn't the way > forward. But we can do something like: process 1 emits messages with a > frequency of 2*log_progress_report_interval, process 2 with a > frequency 4*log_progress_report_interval and so on without needing > additional GUCs. > > Thoughts? Here's v2 patch, passing progress report interval as an input to begin_progress_report_phase() so that the processes can use their own intervals(hard-coded or GUC) if they wish to not use the generic GUC log_progress_report_interval. Thoughts? -- Bharath Rupireddy RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
          • Jump to comment-1
            robertmhaas@gmail.com2022-08-09T12:35:24+00:00
            On Mon, Aug 8, 2022 at 12:29 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Here's v2 patch, passing progress report interval as an input to > begin_progress_report_phase() so that the processes can use their own > intervals(hard-coded or GUC) if they wish to not use the generic GUC > log_progress_report_interval. I don't think we should rename the GUC to be more generic. I like it the way that it is. I also think you should extend this patch series with 1 or 2 additional patches showing where else you think we should be using this infrastructure. If no such places exist, this is pointless. -- Robert Haas EDB: http://www.enterprisedb.com
      • Jump to comment-1
        bdrouvot@amazon.com2022-08-03T11:49:37+00:00
        Hi, On 8/2/22 8:40 PM, Robert Haas wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > On Tue, Aug 2, 2022 at 3:25 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: >> ereport_startup_progress infrastructure added by commit 9ce346e [1] >> will be super-useful for reporting progress of any long-running server >> operations, not just the startup process operations. For instance, >> postmaster can use it for reporting progress of temp file and temp >> relation file removals [2], checkpointer can use it for reporting >> progress of snapshot or mapping file processing or even WAL file >> processing and so on. And I'm sure there can be many places in the >> code where we have while or for loops which can, at times, take a long >> time to finish and having a log message there would definitely help. >> >> Here's an attempt to generalize the ereport_startup_progress >> infrastructure. The attached v1 patch places the code in elog.c/.h, >> renames associated functions and variables, something like >> ereport_startup_progress to ereport_progress, >> log_startup_progress_interval to log_progress_report_interval and so >> on. > I'm not averse to reusing this infrastructure in other places, but I > doubt we'd want all of those places to be controlled by a single GUC, > especially because that GUC is also the on/off switch for the feature. +1 on the idea to generalize this infrastructure in other places. I also doubt about having one single GUC to control all the places: What about adding in the patch the calls to the new API where you think it could be useful too? (and in the same time make use of dedicated GUC(s) where it makes sense?) Regards, -- Bertrand Drouvot Amazon Web Services: https://aws.amazon.com